From Czech Morphology through Partial Parsing to Disambiguation

Authors

  • Eva Mráková
  • Radek Sedláček
Abstract

This paper describes a system for processing raw Czech texts. It consists of several modules, each performing a different level of processing. The modules can easily be incorporated into other linguistic applications, and some of them are already used in this way. The first level is reliable morphological analysis: we survey the effective implementation of ajka, a robust morphological analyser for Czech. Texts tagged by ajka can be further processed by the partial parser Dis and by its extension VaDis, which is based on verb valencies. The output of these systems serves for automatic partial disambiguation of the input texts. The tools described in this paper are widely used for parsing large corpora and can be employed in the initial phase of semantic analysis.
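The idea of the pipeline (ambiguous morphological readings first, then a partial-parsing rule that prunes them) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the lexicon, the tag format, and the function names `analyse` and `disambiguate` are hypothetical and do not reproduce the interfaces, tagsets, or rules of ajka, Dis, or VaDis. The single rule used here relies only on the fact that the Czech preposition "bez" governs the genitive, while the form "ženy" is ambiguous between several cases.

```python
from typing import Dict, List, Set, Tuple

# A reading is (lemma, part of speech, case). For a preposition the third
# field holds the case it governs. This tag format is hypothetical.
Reading = Tuple[str, str, str]

# Hypothetical toy lexicon standing in for a real morphological analyser.
TOY_LEXICON: Dict[str, Set[Reading]] = {
    "bez": {("bez", "PREP", "gen")},
    "ženy": {
        ("žena", "NOUN", "gen"),
        ("žena", "NOUN", "nom"),
        ("žena", "NOUN", "acc"),
    },
}


def analyse(tokens: List[str]) -> List[Set[Reading]]:
    """Return every known reading of each token (no disambiguation yet)."""
    return [
        set(TOY_LEXICON.get(tok.lower(), {(tok, "UNKNOWN", "-")}))
        for tok in tokens
    ]


def disambiguate(readings: List[Set[Reading]]) -> List[Set[Reading]]:
    """Prune noun readings that clash with a directly preceding preposition.

    This is a single hand-written rule, not a parser: it only looks at
    adjacent token pairs and never removes the last remaining reading.
    """
    result = [set(r) for r in readings]
    for i in range(len(result) - 1):
        governed = {case for (_, pos, case) in result[i] if pos == "PREP"}
        if not governed:
            continue
        kept = {
            r for r in result[i + 1]
            if r[1] != "NOUN" or r[2] in governed
        }
        if kept:  # keep the original set if the rule would remove everything
            result[i + 1] = kept
    return result


if __name__ == "__main__":
    tagged = analyse(["bez", "ženy"])
    for token, options in zip(["bez", "ženy"], disambiguate(tagged)):
        print(token, sorted(options))
```

The disambiguation here is partial in the same sense as in the abstract: the rule removes only readings it can rule out and leaves any remaining ambiguity for later processing stages.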


Similar resources

Towards Partial Word Sense Disambiguation Tools for Czech

Complex applications in natural language processing such as syntactic analysis, semantic annotation, machine translation and especially word sense disambiguation consist of several relatively simple independent tasks. Czech, belonging among Slavonic languages with many inflectional features, requires more effort for such tasks, in comparison with other languages. In this article we present two ...


Shallow Parsing of Czech Sentence Based on Correct Morphological Disambiguation

The basis of such an approach is provided by a very complex and sophisticated rule-based morphological disambiguation which can disambiguate Czech sentences with very high reliability, i.e. with a minimum number of errors. This is, of course, very important for any language and all the more so for Czech, whose ambiguity rate is generally extremely high (as compared e.g. to other Slavic language...


Morphological Analysis of Law Texts

In the paper we explore the morphology of Czech law texts, including the Constitution, acts, public notices and court judgements, which form a huge textual database. As in many texts from small domains, the language used is partially restricted and in relevant aspects also different from general Czech. The paper presents first results of the morphological analysis of Czech law texts and their conver...


Morphological and Syntactic Case in Statistical Dependency Parsing

Most morphologically rich languages with free word order use case systems to mark the grammatical function of nominal elements, especially for the core argument functions of a verb. The standard pipeline approach in syntactic dependency parsing assumes a complete disambiguation of morphological (case) information prior to automatic syntactic analysis. Parsing experiments on Czech, German, and H...


PoS Disambiguation and Partial Parsing Bidirectional Interaction

This paper presents Latch, a system for PoS disambiguation and partial parsing that has been developed for Spanish. In this system, chunks can be recognized and can be referred to like ordinary words in the disambiguation process. In this way, sentences are simplified so that the disambiguator can operate by interpreting a chunk as a word and chunk head information as a word analysis. This interactio...




Publication date: 2003